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Abstract 

The  empirical  mode  decomposition  (EMD)  is  an  algorithm  pioneered  by  N.  Huang  et.  al.  as 
an  alternative  technique  to  the  traditional  Fourier  and  wavelet  methods  for  analyzing  nonlinear 
and  non-stationary  signals.  It  aims  at  decomposing  a  signal,  via  an  iterative  sifting  procedure, 
into  several  intrinsic  mode  functions  (IMFs),  and  each  of  the  IMFs  has  better  behaved  instan¬ 
taneous  frequency  analysis.  This  paper  presents  an  alternative  approach  for  FMD.  The  main 
idea  is  to  replace  the  average  of  upper  and  lower  envelopes  in  the  sifting  procedure  of  FMD 
by  a  local  average  obtained  by  variational  optimization  framework.  Therefore,  an  IMF  can  be 
produced  by  simply  subtracting  the  average  from  the  signal  without  iteration.  Our  numerical 

examples  illustrate  that  the  resulting  decomposition  is  convergent  and  robust  against  noise. 
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1  Introduction 

It  is  well  known  that  nonlinear  and  non-stationary  signal  analysis  is  important  and  difficult.  His¬ 
torically,  Fourier  transform  has  provided  a  general  method  for  analyzing  signals,  and  has  achieved 
unprecedented  success  for  signals  generated  by  linear  and  stationary  processes.  For  nonlinear  and 
non-stationary  signals,  there  are  many  methods  too.  For  examples,  windowed  Fourier  transform, 
wavelet  transform  and  the  Wigner-Ville  distribution  are  designed  for  non-stationary  but  linear  sig¬ 
nals.  A  number  of  nonlinear  time  series  analysis  methods,  almost  all  of  them  are  based  on  Fourier 
analysis,  have  been  studied  for  nonlinear  but  stationary  systems  [16,  24,  20].  Despite  some  re¬ 
markable  performances  achieved  by  those  methods  in  various  applications,  analyzing  nonlinear  and 
non-stationary  signals  still  remains  challenging. 

The  empirical  mode  decomposition  (EMD),  first  proposed  by  Huang  et.  al.  in  1998,  is  a  highly 
adaptive  scheme  serving  as  a  powerful  complement  to  the  existing  approaches  [11]  .  The  main 
idea  is  that  any  complicated  data  set  can  be  decomposed  into  a  finite,  and  often  small,  number 
of  intrinsic  mode  functions  (IMFs)  via  the  so-called  sifting  procedure,  where  an  IMF  is  a  function 
that  satisfies  the  following  two  conditions  quoted  directly  from  [11]: 

(1)  In  the  whole  data  set,  the  number  of  extrema  and  the  number  of  zero  crossings  must  either 
equal  or  differ  at  most  by  one; 

(2)  At  any  point,  the  mean  value  of  the  envelope  defined  by  the  local  maxima  and  the  envelope 
defined  by  the  local  minima  is  zero. 

In  recent  years,  the  EMD  has  received  considerable  attentions  in  terms  of  interpretations  ([5,  3, 
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23,  18])  and  applications  in  many  disciplines  such  as  ocean  science  [10],  biomedicine  ([14,  21,  32]), 
speech  signal  processing  [28],  image  processing  [1],  pattern  recognition  ([19,  30,  33])  and  many 
more. 

Despite  its  remarkable  success,  it  is  still  lacking  of  mathematical  understanding  of  the  EMD 
method,  such  as  its  convergence  property,  dependence  on  the  number  of  sifting,  the  stopping  cri¬ 
teria,  and  its  stability  to  noise  perturbations.  Those  shortcomings  are  partly  due  to  the  highly 
adaptive  nature  of  the  sifting  procedure  as  well  as  the  ad  hoc  nature  of  using  cubic  splines  or  B- 
splines  [3] .  This  sparkles  many  studies  to  provide  mathematically  sounding  alternatives  to  the  EMD 
method.  For  examples,  in  [4],  Daubechies  et.  al.  proposed  a  framework,  named  synchrosqueesed 
wavelet  transform,  that  combines  multiresolution  analysis,  amplitude  modulated-frequency  mod¬ 
ulated  (AM-FM)  signals,  and  shape  functions  to  achieve  signal  decompositions.  Hou  et.  al.  [7] 
proposed  a  variational  approach,  based  on  AM-FM  functions  and  higher  order  total  variation  norms 
to  achieve  data  dependent  sparse  signal  decompositions.  Lin  et.  al.  [17]  designed  adaptive  iterative 
filters  to  compute  the  local  average,  which  is  then  used  to  replace  the  average  of  upper  and  lower 
envelopes  in  the  sifting  procedure  in  the  EMD  method.  In  [8,  29,  15],  Peng  et.  al.  developed 
a  local  linear  operator-based  method,  which  decomposes  a  signal  into  summation  of  local  narrow 
band  signals.  And  Pustelnik  et.  al.  [9,  27]  proposed  a  non-smooth  convex  optimization  approach 
to  calculate  multi-components  of  the  signal  so  that  each  one  can  be  considered  as  an  IMF.  In  their 
optimization  model,  a  constrain  is  used  to  enforce  near  orthogonality  among  the  components.  In 
addition,  many  groups  have  conducted  researches  to  improve  the  EMD  method  [22,  25]. 

In  this  paper,  we  present  a  new  mathematical  framework  serving  as  another  alternative,  which 
not  only  has  closer  ties  to  the  original  EMD  method,  i.e.,  its  components  can  be  guaranteed  as 
IMFs,  but  also  can  resolve  some  aforementioned  mathematical  challenges  faced  by  the  original  EMD 
method.  Our  approach  is  motivated  by  a  recent  theoretical  study  [31]  showing  that  condition  (2)  in 
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Huang’s  original  IMF  descriptions  implies  condition  (1)  provided  that  in  the  sifting  procedure,  the 
upper  and  lower  envelopes  are  obtained  by  spline  interpolations  of  the  local  maxima  and  minima 
respectively.  Inspired  by  this  result,  and  the  fact  that  the  sifting  procedure  is  used  to  generate 
IMFs  with  zero  local  averages,  we  propose  to  use  local  variational  optimization  models  to  find  local 
averages  directly,  and  then  the  local  averages  are  subtracted  from  the  signal  to  generate  IMFs.  In 
other  words,  we  use  local  optimization  models  to  replace  the  sifting  procedure  in  the  EMD  method. 

In  the  design  of  our  new  framework,  we  take  several  factors  into  consideration.  First,  we  want 
our  method  to  be  flexible,  so  we  do  not  use  any  pre-defined  functions,  such  as  AM-FM  functions 
or  wavelets  as  the  bases  for  the  IMFs.  This  is  a  feature  shared  by  the  original  EMD  method  and 
the  non-smooth  optimization  approach  taken  by  Pustelnik  et.  al  [9,  27].  On  the  other  hand,  many 
existing  alternatives,  such  as  the  methods  studied  in[4,  7,  8,  29,  15],  use  pre-dehned  functions  as 
bases  for  IMFs.  Second,  we  would  like  to  design  our  method  adaptively  and  locally,  so  we  partition 
the  region  into  subregions  separated  by  local  extreme  points  and  then  solve  local  optimization 
problems  for  the  local  averages.  This  ensures  that  information  from  regions  far  apart  does  not 
influence  each  other.  And  we  like  to  stress  that  this  is  a  distinguished  feature  in  our  framework 
in  response  to  the  nature  of  nonlinear  and  non-stationary  signals.  To  the  best  of  our  knowledge, 
in  other  existing  optimization  based  decompositions,  such  as  the  ones  in  [7,  8,  29,  15,  9,  27], 
the  objective  functions  are  globally  defined.  Third,  considering  the  possibility  that  the  intrinsic 
components  in  a  signal  may  not  be  orthogonal  to  each  other,  we  do  not  use  any  conditions  to 
achieve  near  orthogonality  in  the  decomposition.  This  is  different  from  some  existing  methods 
in  the  literatures,  such  as  the  ones  proposed  in  [9,  27],  in  which  near  orthogonality  is  enforced. 
However,  numerical  examples  indicate  that  our  approach  exhibits  data  dependent  adaptivity  in 
producing  orthogonal  components,  meaning  it  generates  near  orthogonal  decompositions  if  the 
intrinsic  components  are  near  orthogonal,  otherwise  it  does  not  necessarily  decompose  orthogonal 
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components.  Finally,  we  also  want  to  design  our  strategy  to  be  robust  against  noise  perturbations. 
For  this  reason,  we  use  both  total  variation  (TV)  and  high  order  TV  norms,  all  defined  on  the 
subregions  to  avoid  global  influence,  to  account  for  the  noise  perturbations  as  well  as  possible 
discontinuities  in  the  signal.  On  the  contrary,  only  higher  order  TV  norms  are  used  in  the  existing 
methods.  In  computation,  we  employ  the  recent  developed  split  Bregmen  iteration  [34,  6,  2]  to 
compute  the  solutions  of  the  local  optimization  problems  efficiently.  Our  theoretical  study  and 
numerical  experiments  show  that  the  proposed  method  can  achieve  the  desirable  features  with 
convergent,  efficient,  and  robust  against  noise  properties. 

The  rest  of  this  paper  is  organized  as  follows.  We  introduce  our  optimization  model  for  obtaining 
the  local  average,  and  provide  some  details  of  its  implementation  issues  in  Section  2.  In  Section  3, 
we  present  an  alternative  approach  for  the  EMD  method  in  data  decomposition.  In  addition,  we 
apply  discontinuous  signal  and  monocomponent  data  to  test  the  performance  of  our  approach.  We 
also  demonstrate  its  ability  in  data  decomposition,  which  cover  simulative,  real  and  noisy  data.  We 
present  our  conclusion  in  Section  4. 

2  The  novel  local  average  based  on  optimization  for  data  decom¬ 
position 

A  key  idea  in  the  EMD  method  is  to  find  a  locally  determined  curve  (or  average)  so  that  the 
difference  between  the  signal  and  average  is  symmetric  as  described  by  the  two  conditions  of  IMFs 
given  in  [II].  In  the  original  EMD  algorithm,  the  curve  is  obtained  by  the  average  of  upper  and 
lower  envelopes  calculated  by  cubic  splines  through  the  extreme  points.  However,  overshoots  and 
undershoots  are  common,  it  can  generate  new  extrema,  shift  or  exaggerate  the  existing  ones.  The 
produced  difference  usually  does  not  satisfy  the  symmetry  conditions.  Hence,  the  EMD  algorithm 
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use  repeated  sifting  procedure  to  achieve  the  goal.  Inspired  by  this  idea,  we  start  our  investigation 


by  finding  another  local  average  procedure  to  replace  the  sifting  process. 

Let  X  be  a  signal  ,  the  usual  average  of  x  defined  on  [a,  b]  is  given  by  the  following  formula 


1 


X  = 


b  —  a 


x{t)dt, 


(2.1) 


or  a  local  weighted  average  which  will  not  be  discussed  in  this  paper.  But  for  the  data  x  covering 
serval  oscillations  on  [a,  6],  the  average  by  (2.1)  can  not  achieve  our  purpose  mentioned  above.  To 
vividly  illustrate  it,  we  consider  a  toy  example 


x{t)  =  6t  +  cos(87rt),  t  G  [0, 1].  (2-2) 

The  left  of  Figure  1  gives  its  profile,  the  average  computed  by  (2.1),  and  the  ideal  average  we  are 
looking  for.  From  it,  the  red  line,  i.e.,  the  average  computed  by  (2.1)  clearly  can  not  be  the  average 
we  want.  This  is  because  x  contains  several  varying  vibrations,  but  the  red  line  does  not  follow 
the  featured  oscillations.  Inspired  by  these,  to  compute  the  average  of  any  oscillating  signal,  we 
should  partition  it  into  several  monotonic  parts  in  advance.  However,  directly  applying  (2.1)  on  the 
monotonic  intervals  creates  the  staircase  problem,  as  the  red  curve  depicted  in  the  right  of  Figure 
1,  various  jump  points  arise  on  it.  Motivated  by  those  observations,  the  average  we  are  looking  for 
not  only  should  be  adaptivity  and  locality,  but  also  smoothness. 

Let  S[a,  b]  be  the  space  of  all  the  continuous  and  piecewise  monotonic  functions.  That  is,  for 
each  x  G  5[a,  6],  the  domain  [a,b]  can  be  partitioned  into  a  =  af  <  af  <  •  •  •  <  =  b  such 

that  X  is  monotonic  on  each  Bf  =  [af ,  for  f  =  1,  •  •  •  ,  m^,  where  the  superscripts  of  x  express 
the  dependence  of  the  partition  on  x.  We  denote  by  M.  ;  S[a,  b]  — )■  S[a,  b]  the  average  operator  we 
look  for.  Then,  for  each  x  G  S[a,  b],  we  would  like  Mx  to  be  a  local,  adaptive  and  smooth  function 
achieving  the  following  goals: 
goal  1:  fn^(x  —  A4x)dt  =  0  for  i  =  1, . . . , 
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Figure  1:  Left,  blue:  x,  red:  the  average  of  x  obtained  from  (2.1),  green:  the  average  we  look  for;  Right, 
blue:  X,  red:  the  average  given  by  utilizing  (2.1)  for  each  monotonic  part  of  x,  green:  the  average  we  look 
for. 

goal  2:  the  average  of  x—Aix  should  be  zero,  i.e.,  M.{x—M.x)  =  0.  Here  it  needs  to  be  emphasized 
that  the  average  operator  of  x  is  different  from  that  of  x  —  A4x; 
goal  3:  X  —  Aix  is  monotone  on  each  Bf,  i  =  1, . . . ,  . 

It  must  be  pointed  out  that  the  operator  M.  is  data  dependent.  That  is,  different  data  correspond 
to  different  average  operators.  Notice  that,  the  first  goal  is  to  make  Aix  be  a  local  average;  and 
the  second  one  aims  at  ensuring  x  —  Aix  to  be  symmetric  with  respect  to  zero;  the  last  goal  is  to 
guarantee  there  is  no  new  extrema  appeared  in  x  —  A4x.  We  select  optimization  model  to  achieve 
the  goals,  which  is  given  in  the  following  subsection. 

2.1  Adaptive  and  local  average  based  on  TV  and  TV^  norms 

For  a  function  g{t),t  G  [a,  6],  a  common  way  to  measure  its  smoothness  is  through  the  r-order  TV 
norm 

TV"(<?)  =  f  \g^^^^\t)\dt,  T  G  NU{0}.  (2.3) 

J  a 

However,  in  order  to  ensure  that  the  information  from  regions  far  apart  does  not  influence  each 
other,  we  confine  the  TV’^  norm  to  characterize  the  smoothness  separately  on  each  partitioned 
monotone  subinterval  instead  of  the  whole  domain. 
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For  any  x  G  S[a,  b],  {Bf  =  [af,  af+i]}™^i  denotes  its  local  monotone  subintervals.  Let  be  the 
set  of  all  the  functions  defined  on  [a,  b]  satisfying  the  following  conditions: 

(i)  Jj^^{x-g)dt  =  0,i  =  l,...,m^; 

i 

(ii)  X  —  g  \s  monotone  on  Bf,  i  =  1, . . . ,  m^; 

(hi)  g{af)  =  /g.  x{t)dt,  i  =  I, . . .  ,nf  +  I,  where  Bf  :=  :=  0. 

We  define  the  average  Mx  as  the  solution  of  the  following  optimization  problem; 

min  aiTV°(c/) +  a2TV^(5),  (2.4) 

g&Tj; 

where  r  G  N+,  ai,  02  are  two  positive  penalty  parameters  satisfying  ai  +  02  =  1- 

Notice  the  condition  (iii)  in  T^,  it  makes  the  unknown  function  g  be  piecewise  separable  in 
problem  (2.4).  Hence,  minimizing  these  total  variation  norms  of  x  is  intrinsically  imposed  on  each 
Bf  independently.  Furthermore,  taking  both  the  0-order  and  r-order  total  variation  norms  in  the 
objective  are  resulted  from  the  following  reasons.  On  one  hand,  the  0-order  total  variation  norm  is 
to  account  for  those  nonlinear  and  non-stationary  data  containing  jump  points.  And  on  the  other 
hand,  using  the  r-order  total  variation  norm  is  to  ensure  the  smoothness  of  the  average.  Hence, 
to  make  the  average  smooth,  the  parameter  02  takes  a  larger  value  than  ai.  When  the  signals  are 
discontinuous,  we  use  comparable  values  for  both  ai  and  02 • 

According  to  the  conditions  in  T^,  it  is  easy  to  find  that  Aix  satisfies  both  the  goal  1  and  goal 
3.  Also,  we  will  prove  that  it  meets  the  goal  2  through  the  following  theorem. 

Theorem  1.  Let  x  G  5[a,  6]  be  the  signal  to  be  analyzed.  If  the  average  M.x  is  generated  from  the 
optimization  problem  (2-4),  then  there  holds  Ai(x  —  Aix)  =  0. 

Proof  Let  Bf  =  [af,  af_^_l\,  i  =  1, ,  m^,  be  the  intervals  on  which  x  is  monotone.  Then  [a,  b]  = 
yJffZiBf .  Consider  the  optimization  problem  (2.4),  for  any  g  ^  Tx,  let  xi  :=  x  —  g.  Denote 


i  =  1, . . .  ,m^^,  as  the  intervals  on  which  xi  is  monotone.  Since  xi  is  monotone 
on  each  Bf ,  it  is  followed  that  each  Bp  is  a  union  of  some  Bf,  that  is,  Bp  =  Uji^AiBp  where  Aj 
is  a  subset  of  {1, ,  m^}.  Thus, 

xi{t)dt  =  /  {x{t)  —  g{t))dt  =  0,  z  =  l,...,m*h 

It  is  followed  immediately  that 

/  xi{t)dt  =  Q,  i  =  1, . . . ,  +  1, 

JbP.ubP 

where  Bp  =  Bp^i^i  :=  0.  At  last,  xi  is  naturally  monotonic  on  each  Bp.  These  imply  that 

0  G  Tj;^. 

Let  gi  G  be  a  solution  of  (2.4).  Then 

aiTV°(5i)  +  a2TV^(5i)  <  aiTV°(0)  +  a2TV^(0)  =  0.  (2-5) 

Notice  that  ai,  02  are  positive,  it  concludes  that  gi  equals  a  constant  on  [a,  b].  Using  the  condition 
gi{a)  =  xi{t)dt  =  0,  we  have  that  gi{t)  =  0,  Vt  G  [a,  b].  Hence,  it  yields  Ai(x  —  Aix)  =  0.  □ 

For  Theorem  1,  it  indicates  that  our  average  can  make  the  residue  which  is  the  average  sub¬ 
tracted  from  the  original  signal,  be  symmetric  with  zero.  That  is,  it  meets  the  condition  (2)  of  IMF 
in  our  average  sense.  Furthermore,  similar  to  the  analysis  in  [31],  we  have  the  following  conclusion. 

Corollary  1.  Let  x  G  5[a,  6],  and  be  the  set  defined  in  problem  (2.4)-  Then  for  any  continuous 
g  ^  Tx,  the  function  x  —  g  has  a  single  zero  crossing  on  each  of  its  strictly  monotone  interval. 

Proof.  Let  Bf  =  [af.,apf\.,  i  =  1, . . .  ,m*,  be  the  intervals  on  which  x  is  monotone  and  of,  i  = 
1, . . . ,  m^,  be  extreme  points  of  x  satisfying  af  ap-^  for  i  =  1, . . . ,  m*.  Due  to  5^  G  T^,,  we  have 

i)  X  —  g  is  monotonic  on  Bf  and  Px(x  —  g)dt  =  0,  i  =  1, . . . ,  m®; 

ii)  (x  —  g)\t=af  =  ,ub^  “  x{t)]dt,  i  =  1, . . .  ,m  4-1. 

I  2  —  1  I  '  I  2  I  2  —  1  2 
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Without  loss  of  generality,  suppose  that  x  is  increasing  on  Bf .  Then  x(af)  —  x{t)  <  0,  Vt  G 
Bf_^  U  Bf,  and  x{af)  —  <  0.  It  is  immediately  followed  that  (x  —  g)\t=a^  <  0.  Similarly  we 

have  (x  —  >  0.  Since  x  —  is  monotone  on  Bf ,  it  is  clear  to  find  that  x  —  g  is  increasing 

on  Bf  as  x.  Hence,  x  —  g  has  the  same  monotonicity  with  x. 

Moreover,  suppose  that  Bf  is  a  strictly  increasing  interval  of  x  —  Since  x  —  g  is  continuous 
and  satisfies  (x  —  g)\t=a^  <  0  and  (x  —  >  0,  we  clearly  get  it  has  a  single  zero  crossing  on 

Bf .  The  conclusion  is  achieved.  □ 


Remark  1.  In  the  condition  (Hi)  of  Tx,  there  are  many  ways  to  define  the  values  of  g(af),i  = 
1, . . . ,  +  1,  such  as 


(1)  9(af) 


^(af-l)+2x(af)+x(af_^j^) 

4  ' 


where  x(aQ) 


0  and  x(a^^_,_2)  :=  0; 


(2)  g{af)  =  Y)j  !b-  ^(t)dt; 

(3)  g{a^ )  =  rg^— rprgsT  m_b^  x{t)dt. 

We  adopt  the  method  (3)  in  this  paper.  This  is  because  it  has  the  following  advantages  over  the 
others  listed  above.  First,  in  contrast  to  (1),  it  fully  uses  the  information  on  monotonic  intervals 
of  X  rather  than  just  the  values  on  the  three  extreme  points,  this  makes  our  method  much  better 
in  grasping  the  trend  of  x.  Second,  the  points  used  in  (3)  are  just  referred  to  the  local  monotonic 
intervals,  which  is  different  to  the  method  in  (2)  from  which  the  values  are  defined  by  the  weighted 
average  of  the  global  information  of  x,  this  makes  our  method  better  in  avoiding  the  global  influence 
and  robust  to  the  noise  perturbations. 


In  the  following  subsection,  we  propose  the  detail  implementation  for  obtaining  the  numerical 
solution  of  the  minimization  problem  (2.4). 
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2.2  Equivalent  formula  and  numerical  implementation 

To  compute  the  solution  of  the  model  (2.4),  the  major  challenge  is  to  make  it  satisfies  the  condition 
(ii)  in  Tx-  We  introduce  the  following  proposition,  which  can  not  only  give  an  equivalent  description 
of  this  condition,  and  also  provide  an  executable  formula  to  be  incorporated  in  computation. 


Proposition  1.  If  f  is  the  solution  of  the  following  optimization  problem 

min/eS[a,6]nciKfe]  ^V^if) 

s.t.  f{a)  =  PiJib)  =  (I2, 

where  /3i,/?2  are  two  given  constants,  then  f  is  monotonic  on  [a,b]. 


Proof.  It  is  easy  to  see  that  any  monotonic  function  in  C^[a,  b]  is  a  solution  of  (??)  and  the  minimal 
value  of  the  objective  functional  is  |/32  —  /3i|.  Since 

fb  \  fb  I 


TV^if) 


\fit)\dt  > 


/  f'{t)dt 


1/(6) -/(a)|  =  1/32 -/3i|=TV°(/), 


J  a 


J  a 


we  conclude  that  f'{t)  =  A|/'(t)|,  Vt  G  [a,  6],  where  A  is  the  sign  of  (32  —  (di-  Thus  /  is  monotonic 
on  [a,  6].  □ 


Following  Proposition  1,  the  condition  (ii)  in  Tx  can  be  expressed  into  a  minimization  problem. 
Thus,  Tx  can  be  equivalently  reformed  as  follows: 


(i)  f^x{x  —  g)dt  =  0  for  i  =  1, . . .  ,m®; 

(ii)  and  g  is  the  solution  of  the  minimization  problem 

min^  TV°(x  — 5) 

s.t.  (/(Uj  )  rg3  TlTbVT  ij]^x  x(t)dt,  i  1,  .  .  .  ,  7TT.  +  1. 

I  i  —  1 1  '  I  i  I  i—1  i 
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And  then,  similar  to  the  penalty  method  in  classical  optimization  theory  [?],  we  approximate  (2.4) 
as  the  following  formula; 

miug  aTV°(x  —  g)  +  aiTV°((/)  +  a2TY'^{g), 

s.t.  fnx(x  —  g)dt  =  0  for  i  =  1, . . .  (2-7) 

9{^i )  ~  \Bf  ]^|+|i?f|  x(t)dt,  i  =  1, . . .  ,m  +1, 

where  ai,  02  and  Bf,  i  =  0,1, . . . ,  +  1  are  the  same  as  those  mentioned  in  (2.4),  and  a  is  a  large 

enough  positive  parameter  to  control  the  monotonicity  of  x  —  5  on  each  Bf .  This  minimization 
problem  can  be  discretized  as  an  Li  minimization  problem,  which  has  been  well  studied  in  the 
compressed  sensing  field  [34,  6,  2] . 

Suppose  the  signal  is  uniformly  sampled  at  tj,  i  =  1,  2, . . . ,  n,  and  Bf  =  =  1,2,...  m*. 

Then,  the  minimization  problem  (2.7)  can  be  discretized  as  the  following  form: 

min^gM"  a\\D^{x  -  5)l|i  +  o;i\\D^g\\i  + 

s.t.  A{x  —  g)  =  0  (2-8) 

Cg  =  b, 

where  A  =  {an)  G  denotes  the  piecewise  integral  operator,  whose  elements  are  defined  by 

an  =  1  when  ti  G  Bf,  and  an  =  0  otherwise,  f  =  1, . . . ,  rrY;  b  G  whose  elements  are  defined 

as  b{i)  =  lox  Jb^  ub^  x{t)dt,  i  =  1, . . .  ,m^  +  1;  C  =  (cn)  G  where  cn  =  1  when 

ti  =  tj^,  otherwise  cn  =  0,  i  =  1, . . .  ,rrY  +  1;  and  G  and  G  are  the 

1-order  and  (r  +  l)-order  differential  matrixes  respectively.  Moreover,  we  utilize  the  split  Bregman 
iterative  algorithm  discussed  recently  by  Goldstein  and  Osher  [34,  6,  2],  which  performs  well  for 
large-scale  data,  to  solve  the  above  Li  optimization  problem.  The  procedure  is  summarized  in 
Algorithm  1. 

The  computational  cost  of  Algorithm  1  contains  the  computation  of  g^,d\  and  d!^  in  the  split 
Bregman  iteration.  Let  n  be  the  length  of  the  signal.  The  g^  can  be  computed  by  using  the 


12 


Algorithm  1  (Optimal  Average) 

1.  initialization:  =  0,  {i  =  1,2),  =  0,  jji  >  0,  {i  =  1,2,  3, 4),  =  0. 

2.  while  not  converge,  do 

•  =  u\  +  A{x  —  g^),  =  U2  +  b  —  Cg^; 

•  u’^+^  =  «3  +  D^g’^  -  4,  =ul  +  D^+^g^  -  4; 

•  5^+1  ^min{f  P5-u5^+i||2  +  f  ||C7(7-«^+^||2  +  f  ||4-Di5-7x^+i||2  +  '^||4-D-+i(7- 

•  ^  min{a||D^x  —  di||i  +  ai||di||i  +  ^||di  — 

.  4+1  ^  min{a2N2||i  +  f  11^2  -  D-+^g’^+^  -  4+if }. 

3.  end  while. 

existing  optimization  technique  with  computational  complexity  0(n  log  n)  and  4;  can  be  solved 
by  the  soft  thresholding  procedure  respectively,  which  need  only  the  operations  of  vector  product 
and  scalar  contraction  with  a  complexity  20(n).  If  L  iterations  are  implemented,  then  the  total 
computational  complexity  is  L{0{n\ogn)  +  20{n))  =  L  ■  0{n\ogn). 

In  the  experiments  in  this  paper,  we  take  a  =  500,  oi  =  0.01,  0:2  =  0.99  in  problem  (2.8)  in 
general.  We  use  weights  a  =  500,  ai  =  0.5,  02  =  0.5  only  for  discontinuous  signals,  such  as  the  one 
in  Example  1.  Although  the  weights  are  given  in  advance  and  independent  of  the  original  signals, 
we  found  that  the  decomposition  results  are  insensitive  to  the  choices  of  the  parameters  in  our 
experiments.  For  instance,  we  tried  many  different  combinations  of  the  parameters  taking  values 
in  the  ranges  a  €  [100, 1000],  ai  G  [0.3,  0.7]  and  a2  =  1  —  ai  in  Example  1,  the  results  are  similar 
to  the  ones  that  we  depict  in  Figure  4.  The  same  case  is  true  for  Example  2  when  we  take  different 
parameter  values  a  G  [100, 1000],  ai  G  [0.01,0.2]  and  02  =  1  —  cq  respectively. 

On  the  other  hand,  we  put  no  restriction  on  the  parameter  r  except  for  r  G  N+  in  (2.4). 
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However,  different  choices  of  r  value  give  different  results.  For  example,  consider  a  signal  defined 


x{t)  =  < 


by, 

cos(47rt)  t  G  [0,2] 

6(t  —  2)  +  cos(87rt)  t  G  (2, 4], 

Figure  2  shows  the  averages  obtained  from  (2.8)  by  taking  r  =  1,2,  3, 4  respectively.  As  one  can 

see,  the  first  two  averages  are  quite  close  to  the  desirable  one.  But  when  use  r  =  3  or  4,  some  small 
waves  appear  in  the  averages,  and  this  is  not  ideal  for  our  purpose.  Hence,  choosing  a  reasonable 
T  is  important  to  the  average.  If  the  r  is  too  small,  the  average  will  lack  of  regularity.  But  if  it  is 
too  large,  it  will  bring  in  undulation  effects  as  well.  In  this  paper,  we  use  r  =  2,  which  seems  to  be 
the  robustest  among  all  the  cases. 


T=4 


Figure  2:  The  averages  generated  from  our  approach  with  different  r.  From  left  to  right  and  top  to  bottom, 
the  parameter  t  is  taken  as  1, 2,  3, 4  respectively. 


Let  us  look  back  to  the  toy  example  x{t)  =  6t  +  cos(87rt),  t  G  [0, 1]  mentioned  in  (2.2).  The  red 
curve  in  Figure  3  shows  the  average  obtained  from  our  approach.  Compare  it  with  the  two  plots 
depicted  in  Figure  I,  the  average  here  are  remarkably  improved,  and  very  close  to  the  one  we  want 
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shown  in  the  green  line. 


Figure  3:  Blue:  x,  red:  the  average  of  x  obtained  from  our  approach,  green:  the  average  we  are  looking  for. 


3  An  alternative  algorithm  to  generate  IMFs 

Since  our  average  is  based  on  optimization  model,  we  denote  it  as  OA  for  simplicity.  In  this  part, 
we  primarily  focus  on  using  OA  to  perform  an  alternative  algorithm  for  EMD.  It  is  presented  as 
follows. 

3.1  OA-based  data  decomposition  algorithm 

Given  a  signal  G  [a,b]  ,  let  s  =  x,  then  the  first  IMF  ci  can  be  extracted  by  the  following 

iteration. 

1)  Identify  all  the  extrema  of  s,  and  then  obtain  the  partition  {B^  :=  [a|:«l+i]}*='’i  based  on 
these  extreme  points. 

2)  Compute  the  average  Ms  by  solving  the  minimization  problem  (2.8).  Then  ci  =  s  —  Ms  is 
the  first  IMF. 

The  residue  ri  =  x  —  ci  is  treated  as  the  original  signal,  i.e.,  let  s  =  x  —  ci.  And  we  apply  it  to  the 
same  iteration  as  described  above  again,  the  other  IMFs  can  be  calculated  unless  the  residue  is  a 
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monotonic  function.  Finally,  we  obtain 

n 

x  =  '^Ci  +  rn.  (3.1) 

i=l 

3.2  Numerical  experiments 


In  this  section,  we  demonstrate  the  performance  of  the  new  alternative  approach  of  EMD  through 
a  series  of  examples,  which  cover  synthetic  and  real  data.  Prior  to  these,  we  first  examine  the 
performance  of  our  average  (OA)  approach  by  comparing  it  with  other  approaches  introduced  in 
[11,  22,  25].  Since  the  averages  presented  in  [11,  22,  25]  are  respectively  based  on  cubic  spline 
envelopes  (CSE),  a  segment  power-function  based  envelopes  (SPFE)  and  monotone  piecewise  cubic 
interpolation  based  envelopes  (IMCE).  We  denote  them  as  CSA,  SPFA  and  IMCA  respectively  in 
our  experiments  for  simplicity,  . 

Example  1:  We  first  consider  the  signal 


x{t)  =  { 


(3.2) 


cos(127rt)  t  G  [0, 1]  U  [2,  3] 

0.5 -|- 0.3  cos(87rt)  tG(l,2), 
which  is  composited  by  three  intermittent  signals.  Since  cos(127rt)  and  0.3cos(87rt)  are  monocom¬ 
ponent  signals,  the  ideal  average  of  x  should  be 


m{t) 


0  t  G  [0,1]  U  [2,3] 
0.5  tG(l,2). 


Due  to  the  discontinuity  of  x,  to  find  an  desirable  average  is  challenging.  This  is  reflected  in  Figure 
4.  The  left  side  is  the  profile  of  the  signal  x,  the  middle  figure  depicts  the  averages  obtained  by  the 
OA  (blue  line),  the  CSA  (red),  the  IMCA  (green  dash)  and  the  SPFA  (black  dash)  respectively. 
Moreover,  we  plot  the  errors  between  the  averages  and  the  ideal  one  m{t)  on  the  right  side.  It  is 
clear  that  the  OA  is  the  closest  to  rh{t).  Meanwhile,  we  can  use  the  following  root  mean  square 
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(RMS)  [26]  to  measure  the  error  between  an  average  m{t)  and  the  ideal  one  m(t): 


RMS(m,  rh)  = 


(3.3) 


Generally  speaking,  the  smaller  the  RMS,  the  better  the  average  is.  Back  to  this  example,  Table  1 
lists  the  RMSs  of  these  averages  with  rh{t).  It  is  also  seen  that  the  proposed  OA  is  the  best  among 
all  the  cases. 


Figure  4:  Left,  the  signal  x.  Middle,  red  solid:  the  CSA,  green  dash:  the  IMCA,  black  dash:  the  SPFA, 
blue  solid:  the  OA.  Right,  red  solid:  CSA  —  m,  green  dash:  IMCA  —  m,  black  dash:  SPFA  —  m,  blue  solid: 


OA  —  TO. 


Table  1 


Methods 

OA 

IMCA 

CSA 

SPFA 

RMSs  0.0504 

0.0586 

0.0595 

0.0523 

Example  2:  In  this  example,  we  consider  a  monocomponent  signal  x{t)  =  (2t+cost^)  cos(27rt+ 
0.2cost),  t  G  [0, 10].  As  we  know,  the  ideal  average  fh{t)  of  any  monocomponent  signal  should  be 
zero.  This  actually  can  not  be  attained  by  the  existing  approaches.  Figure  5  shows  the  profile  of 
the  signal  and  its  average  including  the  OA,  the  CSA,  the  IMCA  and  the  SPFA.  Clearly,  we  can 
also  see  that  the  OA  is  the  closest  to  m(t)  from  the  right  of  Figure  5.  Meanwhile,  Table  2  shows 
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the  RMSs  of  the  averages.  These  all  indicate  the  OA  is  best  among  them. 


Figure  5:  Left,  the  signal  x.  Right,  red  solid:  the  CSA,  green  dash:  the  IMCA,  black  dash:  the  SPFA,  blue 
solid:  the  OA,  blue  dash:  the  ideal  average  fa. 


Table  2 


Methods 

OA 

IMCA 

CSA 

SPFA 

RMSs  0.1715 

0.6168 

0.6094 

0.6210 

In  the  following  examples,  we  test  our  optimal  average  method  on  several  signals  that  may 
be  non-stationary,  or  polluted  by  noise,  or  from  the  real  world  applications.  We  also  compare 
our  method  with  some  existing  approaches,  such  as  the  M-LFBF  algorithm  introduced  in  [9], 
and  emsamble  EMD  (EEMD)  proposed  in[12,  13].  In  addition,  we  calculate  the  normalized  inner 
product  of  two  components  ci  and  C2  to  measure  the  angle  between  them,  i.e. 

^  I(C1,C2)| 

/5(ci,C2)  =  - r^. 

IIC1II2IIC2II2 

It  is  clear  that  the  smaller  values  of  /o(ci,  C2)  indicate  the  components  are  near  orthogonal.  We  want 
to  remind  that  M-LFBF  enforces  the  near  orthogonality  in  the  optimization  problem  specifically, 
while  we  do  not  in  our  optimal  average  algorithms. 

Example  3:  We  consider  the  signal  composed  by  a  couple  of  orthogonal  components  x{t)  = 
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cos(27rt)  +  sin(87ri).  The  left  of  Figure  6  plots  the  original  and  the  results  obtained  from  our 


approach.  We  use  the  M-LFBF  algorithm  to  decompose  the  same  signal,  and  its  results  are  shown 
in  the  right  of  Figure  6.  The  normalized  inner  products  between  the  IMFs  obtained  from  our 
approach  and  the  M-LFBF  respectively  are  computed  and  listed  in  Table  3.  The  numbers  show 
that  the  two  IMFs  obtained  by  our  method  are  also  near  orthogonal. 


Figure  6:  The  original  and  IMFs  in  Example  3.  Left:  our  approach;  Right:  the  M-LFBF  method. 


Table  3:  Performance  of  the  orthogonality  in  Example  3. 


p  value 

Our  approach 

imfl  imf2 

p  value 

M-LFBF 

imfl  imf2 

imfl 

1 

0.0661 

imfl 

1 

0.0354 

imf2 

0.0661 

1 

imf2 

0.0354 

1 

Example  4:  We  test  a  non-stationary  signal  x{t)  =  cos(t)  -|-cos(t^  -|-t-|-cos(t)),  which  is  formed 
by  two  non-orthogonal  components.  Using  our  approach  yields  essentially  a  perfect  decomposition 
with  two  IMFs  and  a  residue  that  is  close  to  zero,  see  the  left  of  Figure  7.  On  the  right  of  Figure 
7,  we  depict  the  results  from  the  M-LFBF  algorithm.  Table  4  lists  the  normalized  inner  products 
between  the  components  obtained  by  these  two  methods  respectively.  This  example  demonstrates 


19 


that  it  is  not  ideal  to  enforce  the  orthogonality  in  the  decompositions  if  the  signal  consists  of 


non-orthogonal  intrinsic  components. 


Figure  7:  The  original  signal,  IMFs  and  the  residue  of  Example  4.  Left:  our  approach;  Right:  the  M-LFBF 
method. 


Table  4:  Performance  of  the  orthogonality  in  Example  4. 


p  value 

Our  approach 

imfl  imf2  res 

p  value 

imfl 

M-LFBF 

imf2  imf3 

res 

imfl 

1 

0.0316 

0.0195 

imfl 

1 

0.0375 

0.0058 

0.0212 

imf2 

0.0316 

1 

0.0665 

imf2 

0.0375 

1 

0.0318 

0.2565 

res 

0.0195 

0.0665 

1 

imf3 

0.0058 

0.0318 

1 

0.0055 

res 

0.0212 

0.2565 

0.0055 

1 

Example  5:  In  this  example,  we  test  our  approach  on  signal  polluted  by  Gaussian  noise.  Here 
the  signal  is  given  by 

x{t)  =  0.6t  +  sin(t)  +  sin(3t)  +  e(t),  (3.4) 

where  e{i)  is  a  Gaussian  noise.  Our  method  generates  four  IMFs  and  a  trend,  which  are  depicted 
in  the  top  of  Figure  8.  The  three  plots  on  the  left  correspond  to  the  polluted  signal  and  noise 
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components  respectively,  the  other  three  components  on  the  right  are  very  close  to  the  intrinsic 
components  of  x.  We  also  use  the  EEMD  ^  technique  [12,  13]  to  decompose  the  signal  and  show 
its  results  in  the  bottom  of  Figure  8.  Table  5  lists  the  normalized  inner  products  between  each  two 
components  of  our  method,  it  also  shows  our  method  can  generate  near-orthogonal  components  in 
this  case. 


Figure  8;  Top:  the  figures  are  corresponding  to  the  original  signal  and  its  components  obtained  from  our 
approach  in  Example  5.  Bottom:  these  are  the  original  signal  and  its  components  obtained  from  EEMD. 


Example  6:  In  the  previous  examples,  we  use  several  synthetic  signals  to  test  our  approach. 

Here,  we  consider  the  length  of  the  day  (LOD)  data,  which  covers  700  consecutive  days  starting 

from  1978.  The  original  signal  and  the  IMFs  derived  from  our  approach  are  plotted  in  Figure 

9.  Like  the  EMD  approach,  our  method  can  recover  almost  all  the  physically  meaningful  IMFs. 

Clearly,  in  Figure  9,  the  first  IMF  captures  the  semi  monthly  cycles  while  the  second  IMF  indicates 

the  monthly  oscillations.  Similarly,  the  third  IMF  captures  the  semi  annual  cycle  and  the  last  IMF 

captures  the  annual  tendency.  Table  6  indicates  the  angles  between  the  components  obtained  by 
^We  use  the  code  made  by  M.  E.  Torres  et.  al.,  it  is  available  at  http://perso.ens-lyon.fr/patrick.flandrin/emd.html. 
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our  method. 


Figure  9:  The  IMFs  and  trend  of  LOD  data  in  Example  6. 


4  Conclusion 

The  EMD  has  found  many  applications  in  a  variety  of  problems  covering  signal  processing,  pattern 
recognition,  biomedicine  engineering  and  ocean  science  since  its  advent.  Although  it  often  proves 
remarkably  effective,  a  mathematical  foundation  is  virtually  nonexistent.  The  difficulty  is  partly 
due  to  the  average  based  on  cubic  spline.  In  view  of  this,  we  present  a  novel  adaptive  and  local 
average  based  on  local  optimization  to  overcome  this  problem.  The  improvements  in  our  approach 


Table  5:  Performance  of  the  orthogonality  in  Example  5. 


p  value 

imfl 

IMFs  and  residue 

imf2  imf3  imf4 

res 

imfl 

1 

0.1789 

0.0070 

0.0040 

0.0073 

imf2 

0.1789 

1 

0.3219 

0.0408 

6.1486  X  10"^ 

imf3 

0.0070 

0.3219 

1 

0.1078 

0.0016 

imf4 

0.0040 

0.0408 

0.1078 

1 

0.0466 

res 

0.0073 

6.1486  X  10"^ 

0.0016 

0.0466 

1 
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can  be  summarized  as  follows.  First,  our  approach  does  not  use  any  pre-defined  functions,  such 


as  AM-FM  functions  or  wavelets  as  the  basis  for  IMFs,  this  makes  our  approach  share  similar 
flexibilities  with  the  original  EMD  method.  And  second,  the  objective  and  the  constraint  conditions 
are  defined  on  the  monotonic  intervals  of  the  signal,  this  can  avoid  the  global  influence  and  then 
makes  our  approach  robust  to  the  noise  perturbations.  We  also  proved  that  the  IMFs  obtained  by 
our  alternative  approa  ch  satisfy  the  two  requirements  proposed  by  Huang  et.  el.  in  the  original 
EMD  method.  Moreover,  the  existing  theory  on  total  variation  minimization  provides  convergence 
guarantee  for  our  iterations.  In  addition,  the  experiments  show  very  encouraging  results,  and  we 
hope  that  this  alternative  approach  can  lead  to  more  stable  performance  in  applications,  which  will 
be  considered  in  our  future  research  studies. 
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